Human proinsulin and analogs thereof and method of preparation by microbial polypeptide expression and conversion
thereof to human insulin.

Microbial expression of a chimeric gene is used to
produce a polypeptide comprising the amino acid sequence
of human proinsulin, or an analog thereof differing in the "C"
chain portion. A polypeptide so produced contains a
sequence of additional amino acid units sufficient in number to
protect it from bacterial proteases, and has a cleavage site,
e.g. a methionine residue, adjacent the sequence of amino
acid units corresponding to the proinsulin or proinsulin
analog. Cleavage at this site (e.g. by CNBr) generates
proinsulin (or the analog), which is treated in vitro to form the
disulfide bonds between the "A" and "B" chain proteins
characteristic of human insulin. The "C" chain portion is then
excised enzymatically to yield human insulin useful e.g. in
the treatment of diabetes.

The chimeric gene may be synthesised from
oligonucleotides and inserted into a plasmid which is used to
transform a host cell, e.g. E. coli.

HUMAN PROINSULIN AND ANALOGS THEREOF
AND METHOD OF PREPARATION BY MICROBIAL
POLYPEPTIDE EXPRESSION AND
CONVERSION THEREOF TO HUMAN INSULIN

Related Applications 

This application is related to and incorporates by 
reference the disclosures of European Patent Application 
Publications Nos. 0001930 (A.N. 78300597.8) and 0036776 
(A.N. 81301227.5). 

Field of the Invention 

This invention relates to microbial expression of
polypeptides. In one aspect, it relates to the preparation of
genes for the microbially expressible production of
intermediates useful in the preparation of human insulin. In
another aspect, it relates to the preparation of human
proinsulin or analogs thereof differing from human proinsulin
in the "C" chain portion. In yet another aspect, it relates to
the preparation of human insulin from the prepared human
proinsulin or an analog thereof.

Background of the Invention

Diabetes, the human condition characterized by a failure of
the pancreas to generate the polypeptide hormone insulin in
sufficient quantities, in severe cases at least, is currently
treated by injection of insulin derived from the pancreas of
slaughtered animals. Bovine and porcine insulin, in
particular, are used for this purpose.

The use of insulin derived from animals is unsatisfactory
from at least two standpoints. In the first place, the
extraction of insulin from the pancreas of slaughtered animals
is a complex process that requires large quantities of the
organs. Secondly, and more importantly from the diabetic's
point of view, the insulin derived from animal sources is not
chemically identical to human insulin, differing in the
sequence of peptide units. Furthermore, it sometimes contains
non-homologous animal hormones, such as the corresponding
proinsulin, albeit in small quantities. As a result, the
response of patients treated with animal derived insulin is not
as satisfactory as desired. For example, an immune response to
animal insulin is believed to be a source of chronic
complications in certain treatments of diabetes.

Accordingly, there has gone unfilled a long felt need to
have a source of insulin identical chemically to human insulin,
uncontaminated by other biologically active impurities, in
amounts sufficient to permit diabetics to be treated
economically. Complicating this task is the complex chemical
structure of human insulin. Structurally it has two
polypeptide chains referred to as the "A" and "B" chains bound
to each other by disulfide bonds. The A chain, some 21 amino
acid units in length, is bound (crosslinked) to the B chain, a
chain of 30 amino acids, through disulfide bonds between units
of the amino acid cysteine in each chain.

To be maximally effective in humans, the amino acid units
of insulin must be precisely ordered to correspond to that
produced in vivo. However, the complexity of the molecule is
such that conventional methods of chemical synthesis are
unsuited to its preparation, on a commercial scale at least.

Insulin is produced in vivo in the pancreas in the form of
preproinsulin. Preproinsulin is a polypeptide comprising the
21 units of the A chain, the 30 units of the B chain, a
bridging or connecting chain of 35 units referred to as the C
chain and a 24 amino acid "presequence" (Met Ala Leu Trp Met
Arg Leu Leu Pro Leu Leu Ala Leu Leu Ala Leu Trp Gly Pro Asp Pro
Ala Ala Ala) attached to the N-terminal phenylalanine amino
acid beginning the B chain. Proinsulin, lacking the
presequence, is shown in Figure 1 with a methionine amino acid
in place of the presequence. This presequence may participate
in secretion from the cells in which it is produced. As the
preproinsulin is excreted from the islet cells of the pancreas,
the presequence is excised to leave the proinsulin chain. This
chain folds to a structure in which three disulfide bonds are
formed, two of which are between the A and B chain segments of
the proinsulin. The connecting C chain is then excised
proteolytically to leave a residue which is insulin, consisting
of the A and B chains bound together by the disulfide bonds.
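
By way of illustration only, the segment lengths recited above can be tallied in a short Python sketch. The presequence is transcribed from the text; the one-letter abbreviations are the conventional amino acid codes, and the variable names are chosen here for illustration and do not appear in the specification.

```python
# Conventional three-letter to one-letter amino acid codes (only those needed).
THREE_TO_ONE = {
    "Met": "M", "Ala": "A", "Leu": "L", "Trp": "W", "Arg": "R",
    "Pro": "P", "Gly": "G", "Asp": "D",
}

# Presequence as recited in the text.
PRESEQUENCE = (
    "Met Ala Leu Trp Met Arg Leu Leu Pro Leu Leu Ala Leu Leu "
    "Ala Leu Trp Gly Pro Asp Pro Ala Ala Ala"
)

residues = PRESEQUENCE.split()
presequence_one_letter = "".join(THREE_TO_ONE[r] for r in residues)

# Segment lengths stated in the text (amino acid units).
SEGMENTS = {
    "presequence": len(residues),
    "B chain": 30,
    "C chain": 35,
    "A chain": 21,
}

preproinsulin_length = sum(SEGMENTS.values())
proinsulin_length = preproinsulin_length - SEGMENTS["presequence"]

print(presequence_one_letter, preproinsulin_length, proinsulin_length)
```

On these figures the presequence accounts for 24 of the 110 units of preproinsulin, leaving 86 units for proinsulin.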

This application describes a method for obtaining human
insulin and human proinsulin and analogs thereof which differ
from human proinsulin in the sequence of amino acids making up
the C chain. The method utilizes the burgeoning recombinant
DNA technology. The following discussion of elements of the
technology provides background to the detailed description of
the invention.

With the advent of recombinant DNA technology, the
controlled bacterial production of useful polypeptides has
become possible. Already in hand are bacteria modified by this
technology to permit the production of such polypeptide
products as somatostatin (K. Itakura et al., Science 198, 1056
(1977)), the (component) A and B chains of human insulin (D.V.
Goeddel et al., Proc. Nat'l. Acad. Sci. USA 76, 106 (1979)) and
human growth hormone (D.V. Goeddel et al., Nature 281, 544
(1979)). Such is the power of the technology that virtually
any useful polypeptide may be bacterially produced, putting
within reach the controlled manufacture of hormones, enzymes,
antibodies, and vaccines useful against a wide variety of
diseases. The cited materials, which describe in greater
detail the representative examples referred to above, are
incorporated herein by reference, as are other publications
referred to infra, to illuminate the background of the
invention.

The work horse of recombinant DNA technology is the
plasmid, an extra-chromosomal loop of double-stranded DNA found
in bacteria, oftentimes in multiple copies per bacterial cell.
Included in the information encoded in the plasmid DNA is that
required to reproduce the plasmid in daughter cells (i.e., a
"replicon") and ordinarily, one or more selection
characteristics, such as resistance to antibiotics, which
permit clones of the host cell containing the plasmid of
interest to be recognized and preferentially grown in selective
media. The utility of plasmids, which can be recovered and
isolated from the host microorganism, lies in the fact that
they can be specifically cleaved by one or another restriction
endonuclease or "restriction enzyme", each of which recognizes
a different site on the plasmidic DNA. Thereafter heterologous
genes or gene fragments may be inserted into the plasmid by
endwise joining at the cleavage site or at reconstructed ends
adjacent the cleavage site.
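
The site-specific cleavage just described may be sketched, purely for illustration, as a search for recognition sequences along one strand of a linear DNA. The recognition sequences and cut offsets below are the commonly tabulated ones for enzymes named later in this specification; the function names and the demonstration sequence are hypothetical, and the sketch ignores the complementary strand, circular molecules and methylation.

```python
# Recognition sequence and top-strand cut offset (bases from the site start).
SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "HpaII": ("CCGG", 1),      # C^CGG
    "PstI": ("CTGCAG", 5),     # CTGCA^G
}


def cut_positions(seq, enzyme):
    """Return every top-strand cut index for one enzyme in a linear sequence."""
    site, offset = SITES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Cut a linear sequence with all listed enzymes; return fragments in order."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical demonstration sequence containing one EcoRI and one BamHI site.
demo = "TTGAATTCAAAAGGATCCTT"
print(digest(demo, ["EcoRI", "BamHI"]))
```

The middle fragment carries an EcoRI end on one side and a BamHI end on the other, which is the arrangement relied upon when a gene fragment is cloned directionally between two different sites.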



As used herein, the term "heterologous" refers to a gene
not ordinarily found in, or a polypeptide sequence ordinarily
not produced by, the host microorganism whereas the term
"homologous" refers to a gene or polypeptide which is produced
in the host microorganism, such as E. coli. DNA recombination
is performed outside the microorganisms but the resulting
"recombinant" plasmid can be introduced into microorganisms by
a process known as transformation and large quantities of the
heterologous gene-containing recombinant plasmid obtained by
growing the transformant. Moreover, where the gene is properly
inserted with reference to portions of the plasmid which govern
the transcription and translation of the encoded DNA message,
the resulting plasmid or "expression vehicle", when
incorporated into the host microorganism, directs the
production of the polypeptide sequence for which the inserted
gene codes, a process referred to as expression.

Expression is initiated in a region known as the promoter
which is recognized by and bound by RNA polymerase. In some
cases, as in the trp operon discussed infra, promoter
regions are overlapped by "operator" regions to form a combined
promoter-operator. Operators are DNA sequences which are
recognized by so-called repressor proteins which serve to
regulate the frequency of transcription initiation at a
particular promoter. The polymerase travels along the DNA,
transcribing the information contained in the coding strand
from its 5' to 3' end into messenger RNA which is in turn
translated into a polypeptide having the amino acid sequence
for which the DNA codes. Each amino acid is encoded by a
unique nucleotide triplet or "codon" within what may for
present purposes be referred to as the "structural gene", i.e.
that part which encodes the amino acid sequence of the
expressed product. After binding to the promoter, the RNA
polymerase first transcribes nucleotides encoding a ribosome
binding site, then a translation initiation or "start" signal
(ordinarily ATG, which in the resulting messenger RNA becomes
AUG), then the nucleotide codons within the structural gene
itself. So-called stop codons are transcribed at the end of
the structural gene whereafter the polymerase may form an
additional sequence of messenger RNA which, because of the
presence of the stop signal, will remain untranslated by the
ribosomes. Ribosomes bind to the binding site provided on the
messenger RNA, in bacteria ordinarily as the mRNA is being
formed, and themselves produce the encoded polypeptide,
beginning at the translation start signal and ending at the
previously mentioned stop signal. The desired product is
produced if the sequences encoding the ribosome binding site
are positioned properly with respect to the AUG initiator codon
and if all remaining codons follow the initiator codon in
phase. The resulting product may be obtained by lysing the
host cell and recovering the product by appropriate
purification from other microorganism protein.
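
The reading of codons in phase from the start signal to a stop signal can be sketched as follows. This is a minimal illustration using the standard genetic code; the function name and the short demonstration sequence (a start codon followed by codons for Phe Val Asn Gln His and a stop codon) are assumptions made here and are not taken from Figure 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_first_orf(dna):
    """Translate from the first ATG, codon by codon, until a stop codon."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: leader, ATG, five codons, TAA stop, trailing bases.
demo = "GGAGGTTTAACATGTTCGTCAATCAGCACTAAGG"
print(translate_first_orf(demo))
```

Shifting the start of reading by one or two bases changes every downstream codon, which is why the text requires that all remaining codons follow the initiator codon in phase.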

Polypeptides expressed through the use of recombinant DNA
technology may be entirely heterologous, as in the case of the
direct expression of human growth hormone, or alternatively may
comprise a heterologous polypeptide and, fused thereto, at
least a portion of the amino acid sequence of a homologous
peptide, as in the case of the production of intermediates for
somatostatin and the components of human insulin. In the
latter cases, for example, the fused homologous polypeptide
comprised a portion of the amino acid sequence for beta
galactosidase. In those cases, the intended bioactive product
is bioinactivated by the fused, homologous polypeptide until
the latter is cleaved away in an extracellular environment.
Fusion proteins like those just mentioned can be designed so as
to permit highly specific cleavage of the precursor protein
from the intended product, as by the action of cyanogen bromide
on methionine, or alternatively by enzymatic cleavage. See,
e.g., G.B. Patent Publication No. 2 007 676 A.
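
The cyanogen bromide cleavage mentioned above can be sketched as a split of the polypeptide after each methionine residue (the chemistry cleaves on the carboxyl side of methionine). In the sketch below the carrier sequence is purely hypothetical; the 30-residue sequence is the commonly published human insulin B chain, used here only because it contains no internal methionine.

```python
def cnbr_cleave(protein):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Commonly published human insulin B chain (30 residues, no methionine).
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"

# Hypothetical carrier polypeptide, followed by the methionine cleavage site.
CARRIER = "MKAIFVLKGS"
fusion = CARRIER + "M" + B_CHAIN

fragments = cnbr_cleave(fusion)
print(fragments[-1] == B_CHAIN)
```

Because the desired sequence contains no methionine, it survives as a single fragment, whereas the carrier is cut wherever methionine occurs. A product containing internal methionine would not be suited to this cleavage.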

Human insulin has hitherto been obtained employing
techniques of recombinant DNA technology. The process used a
synthetic gene for the A chain which is expressed in E. coli
and a separate synthetic gene for the B chain which is
expressed in another E. coli. D.V. Goeddel et al., Proc. Nat.
Acad. Sci., USA, 76, 106 (1979). The two chains are obtained
as chimeric polypeptides (proteins) comprising the desired
sequence of amino acids (either the A or B chain sequence)
bound to another section of carrier polypeptide designed to
protect the desired sequence from proteases in the E. coli.
The chimeric proteins have a selective cleavage site adjacent
the desired polypeptide sequence of the A or B chain which
permits separation of the desired sequence from the carrier
polypeptide. Isolation of the two sequences is followed by the
formation in vitro of the disulfide bonds.

This process is necessarily complicated by the fact that
two distinct genetically modified bacterial strains must be
obtained and maintained. Further, the prior process requires
separate isolation of the A and B chains and the crosslinking of
the two chains by means of the formation of the disulfide bonds
without the aid in orientation provided by an intact C chain.

The present application describes a process for the
construction of a single gene to express, in a single
microorganism, a chimeric protein which includes a complete
human proinsulin polypeptide or an analog thereof differing
from human proinsulin in only the amino acid sequence of the C
chain. Human insulin can be cleanly excised from these
polypeptides after in vitro formation of the disulfide
crosslinks between the A and B chains.

By this process, proinsulin, and analogs thereof, can be
directly obtained in substantially pure form and free of
biologically active impurities. Similarly, the proinsulins can
be effectively processed to obtain insulin chemically identical
to human insulin, similarly free of biologically active
impurities, thus promising a more effective treatment of human
diabetes than is possible using animal derived insulin.

Summary of the Invention 

The present invention provides a method for obtaining human
insulin by means of a chimeric polypeptide comprising the polypeptide
sequence of human proinsulin, or an analog thereof differing from the
polypeptide sequence of human proinsulin in the sequence of amino
acids comprising the C chain, fused to additional protein or
protein fragment, there being a selective cleavage site which
permits cleavage of the proinsulin or its analog from the additional
protein or fragment. The cleaved proinsulin product may then be
caused to orient by formation of the characteristic insulin A
and B chain disulfide crosslinks and the crosslinked insulin
precursor may then be excised from the "C" chain carrier.

The "C" chain (or, hereinafter, "bridging chain") of amino
acid units of the analog proinsulins made according to the
present invention may comprise as few as 2 amino acid units.
The identity and sequence of amino acid units intermediate the
ends of the bridging chains in analogs of human proinsulin are
not particularly significant. However, the end units thereof
must be units which permit facile excision of the bridging
chain from the A and B chains of human insulin. Preferably,
excision occurs after the proinsulin molecule has been cleaved
from the additional protein of the chimeric polypeptide and, most
preferably, after the thus cleaved proinsulin molecule has been
folded and the disulfide links between the A and B chains
characteristic of human insulin have been formed.
Preferably, the bridging chain has sites which permit its
excision by enzymatic means. Preferred for this purpose
are Arg-Arg and Lys-Arg units on the bridging chain which are
adjacent the terminal -COOH end of the B chain and terminal
-NH2 end of the A chain, respectively, as is found in human
proinsulin itself. Also preferred are two Arg-Arg units
between the B and A chains.
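
The arrangement of end units described above can be illustrated with a short sketch that separates a proinsulin analog into its B chain, bridging chain and A chain and checks that the bridging chain begins with Arg-Arg and ends with Lys-Arg or Arg-Arg. The A and B chain sequences are the commonly published human insulin sequences; the six-unit bridging chain and the function name are hypothetical, and the sketch models only the positions of the excision sites, not the enzymology.

```python
# Commonly published human insulin chains (one-letter code).
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # 30 units
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"           # 21 units


def excise_bridge(proinsulin, b_len=30, a_len=21):
    """Split B chain, bridging chain and A chain; check the bridge end units."""
    b_chain = proinsulin[:b_len]
    bridge = proinsulin[b_len:len(proinsulin) - a_len]
    a_chain = proinsulin[len(proinsulin) - a_len:]
    if bridge[:2] != "RR" or bridge[-2:] not in ("KR", "RR"):
        raise ValueError("bridging chain lacks the preferred dibasic end units")
    return b_chain, bridge, a_chain


# Hypothetical analog: Arg-Arg, two arbitrary interior units, Lys-Arg.
analog = B_CHAIN + "RRGSKR" + A_CHAIN
b_chain, bridge, a_chain = excise_bridge(analog)
print(bridge)

# Shortest case contemplated by the text: a bridging chain of two units.
minimal = B_CHAIN + "RR" + A_CHAIN
print(excise_bridge(minimal)[1])
```

The interior units of the bridging chain play no part in the check, consistent with the statement that their identity and sequence are not particularly significant.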

The chimeric proteins of the present invention are obtained
by expression of a heterologous structural gene for the
proinsulin or analog in a recombinant microbial cloning vehicle
in which the gene is in reading phase with a DNA sequence
coding for an additional protein portion of the chimeric protein
and the cleavage site. In preferred embodiments of the
invention, the cleavage site is methionine at the N-terminal of
the proinsulin which permits cleavage using cyanogen bromide.

The additional protein can vary but the preferred
additional protein is methionine amino acid or the presequence
of preproinsulin or a portion thereof or β-galactosidase
or a substantial portion thereof or a portion of the amino acid
sequence encoded by a fragment of the trp leader polypeptide
gene fused to a portion of the trp E polypeptide gene or the
trp D polypeptide. The added, ultimately superfluous portion
of the chimeric protein is selected to provide protection from
bacterial proteases which might otherwise digest the proinsulin.

The preferred recombinant microbial cloning vehicle is a
modified, preferably bacterial, plasmid containing the
structural gene in a reading phase with the DNA sequence which
codes for the added, superfluous portion of the chimeric
polypeptide and the selective cleavage site.

The manner in which these and other objects and advantages
of the invention are achieved will be apparent to those skilled
in the art after consideration of the following description of
the preferred embodiments and the illustrations of Figs. 1-6.

Brief Description of the Drawings

Figure 1 shows the nucleotide sequence of a gene for human
proinsulin.

Figure 2 illustrates a scheme for obtaining a plasmid
containing a fragment of the gene of Figure 1 which was derived
using reverse transcription from mRNA.

Figure 3 illustrates a scheme for obtaining a plasmid
containing the gene of Figure 1 suitable for transformation
into, e.g., E. coli for expression of human proinsulin.

Figure 4 illustrates the analysis by HPLC chromatography of
human proinsulin obtained by expression from an, e.g., E. coli
transformant containing the plasmid derived from the scheme of
Figure 3.

Figure 5 illustrates segments of a gene for expression of
an analog of human proinsulin differing from human proinsulin
in the amino acid sequence of the C, bridging chain.

Figure 6 illustrates a scheme for assembling a plasmid
containing a gene for transformation into, e.g., E. coli for
expression of an analog of human proinsulin.

Detailed Description 

A. Preparation of Human Proinsulin

1. Preparation of Synthetic Gene Coding For The 32
N-Terminal Amino Acids of Proinsulin
a. Oligonucleotide Synthesis

A series of 16 oligonucleotides, short nucleotide
chains 10-12 units in length shown in Table 1, were prepared as
a first step to the construction of a gene coding for the first
32 amino acids of proinsulin, the amino acid sequence of which
is shown in Fig. 1 as a part of the nucleotide sequence of the
entire gene ultimately constructed for use in the present
invention for the expression of human proinsulin. The
individual nucleotides in the gene are identified by the
letters A, T, C or G representing the bases adenine, thymine,
cytosine or guanine which distinguish one nucleotide from
another.

TABLE 1

Oligonucleotides:

H1      AATTCATCTT
H2      CGTCAATCAGCA
H3      CCTTTGTCCTTC
H4      TCACCTCGTTGA
H5      TTGACGAACATG
H6      CAAAGGTGCTGA
H7      AGGTGAGAACCA
H8      AGCTTCAACG
B1      AGCTTTGTAC
B2      CTTGTTTGCGGT
B3      GAACGTGGTTTC
B4      TTCTACACTCCT
B5'     AAGACTCGCC
B6      AACAAGGTACAA
B7      ACGTTCACCGCA
B8      GTAGAAGAAACC
B9      AGTCTTAGGAGT
B10'    GATCCGGCG

The synthetic nucleotides are shown between
brackets in Fig. 1. These oligonucleotides were
synthesized by the triester method: R. Crea et al., Proc. Nat.
Acad. Sci., USA, 75, 5765 (1978), K. Itakura et al., J. Biol.
Chem., 250, 4592 (1975) and K. Itakura et al., J. Am. Chem.
Soc., 97, 7327 (1975). Some of these are oligonucleotides
which were used in a gene coding for the B chain of human
insulin previously described by Crea et al., Proc. Nat. Acad.
Sci., USA, 75, 5765 (1978), and Goeddel et al., Proc. Nat. Acad.
Sci., USA, 76, 106 (1979). Two synthetic oligonucleotides
(B5' and B10') were newly synthesized for this
project; the others were prepared according to Crea et al.
(supra). The two new oligonucleotides, also prepared according
to Crea et al., incorporate restriction enzyme recognition
sites for HpaII and terminal BamHI, the latter used for
cloning. The other end of the gene contains a sticky end of an
EcoRI site for cloning purposes.

b. Joining of Synthetic Oligonucleotides

The eight oligonucleotides H1-H8 were used
previously to construct the left half of the B chain gene. This
was used in this process and is described by Goeddel et al.,
Proc. Natl. Acad. Sci. USA 76, 106 (1979). It contains the
codons for amino acids 1-13 of the B chain and a
methionine unit at the N-terminal, used later to cleave the
proinsulin from bacterially expressed chimeric protein using
cyanogen bromide (CNBr).

The right half of the B chain gene was obtained
from the oligonucleotides B1, B2, B3, B4, B5', B6,
B7, B8, B9 and B10', by ligation using T4 ligase and
a technique described by Goeddel et al. (supra). The gene
fragment produced codes for amino acid units 14-30 of the B
chain and the first unit, Arg, of the bridging chain.
Incorporated into the gene sequence is an HpaII restriction
enzyme site in the same reading frame and location as an HpaII
site in the human insulin gene. After purification of the
ligated gene fragment by polyacrylamide gel electrophoresis,
and elution of the largest DNA band, the fragment was inserted
into the plasmid pBR322 that had been cleaved with restriction
endonucleases HindIII and BamHI, thereby utilizing the HindIII
and BamHI sites on the synthetic gene fragment. The DNA was
inserted into E. coli 294 (ATCC No. 31446) by transformation.
One plasmid, pB3', recovered from an ampicillin resistant,
tetracycline sensitive clone was found to possess the desired
nucleotide sequence according to the method of A.M. Maxam
et al., Proc. Natl. Acad. Sci. USA 74, 560 (1977).

From the two plasmids pBH1 and pB3' two DNA
fragments were recovered, a 46 base pair EcoRI to HindIII
fragment from pBH1, and a 58 base pair HindIII to BamHI
fragment from pB3'.

The two fragments were ligated together to produce
a fragment having an EcoRI site and a BamHI site. This
fragment was inserted in plasmid pBR322 which had been treated
with EcoRI and BamHI restriction endonucleases using the method
described in Goeddel et al., Proc. Nat. Acad. Sci., USA, 76, 106
(1979) and cloned in E. coli K-12 strain 294 (ATCC No. 31446)
to provide the plasmid pIB3. After cloning, the plasmid pIB3
was cleaved with EcoRI and HpaII restriction endonucleases to
recover the synthetic gene fragment (Fragment 1, Figure 3)
containing the codons for the N-terminal proinsulin amino acids
preceded by a methionine codon as shown in Figs. 1 and 3. The
synthetic gene was isolated by polyacrylamide gel
electrophoresis.

2. Isolation of a cDNA Gene Coding For the 55 C-Terminal
Amino Acids of Human Proinsulin

The scheme for obtaining the cDNA gene is
shown in Fig. 2.

A decanucleotide was synthesized containing the
recognition sequence for BamHI endonuclease, to which was added
a 3' polythymidylic acid tract of approximately 20 residues.
Its sequence is pCCGGATCCGGTT₁₈T. This oligonucleotide was
used to prime AMV reverse transcriptase for cDNA synthesis.

The primer was prepared using terminal deoxynucleotidyl
transferase (Enzo Biochem, 200 units) with one µmole of the
BamHI decanucleotide in a reaction volume of 0.6 ml containing
l.S * 10"* TTP. The reaction was conducted at 37*C for one 
hour in a buffer system described by A. Chang et al., Nature,
275, 617 (1978).

Human insulinoma polyA tissue (2.5 µg) provided by the
Institut fuer Diabetesforschung, Muenchen, West Germany
(Dr. Wolfgang Kemmler) containing mRNA isolated by the process
of Ullrich et al., Science, 196, 1313 (1977) was converted to
double stranded cDNA by a procedure according to Wickens et al.,
J. Biol. Chem., 253, 2483 (1978). Thus, 80 µl containing 15 mM
Tris/HCl (pH 8.3 at 42°C), 21 mM KCl, 8 mM MgCl2, 30 mM
β-mercaptoethanol, 2 mM of the primer dCCGGATCCGGTT₁₈T, and 1
mM dNTPs was preincubated at 0°C. Then 40 units of AMV reverse
transcriptase were added and the mixture incubated for 15
minutes at 42°C.

The complementary cDNA strand was synthesized in a
volume of 150 µl containing 25 mM Tris/HCl (pH 8.3), 35 mM KCl,
4 mM MgCl2, 15 mM β-mercaptoethanol and 9 units of DNA
polymerase I (Klenow fragment). The mixture was incubated at
15°C for 90 minutes followed by 15 hours at 4°C. S1 nuclease
digestion was then performed for 2 hours at 37°C using 1000
units of S1 nuclease (Miles Laboratories) as described by
Wickens et al., supra. The double stranded cDNA (0.37 µg) was
subjected to electrophoresis on an 8 percent polyacrylamide
gel. DNA fragments larger than 500 base pairs were eluted.
Oligodeoxycytidylic acid residues were added to the 3' ends of
the fragments using terminal deoxynucleotidyl transferase by
the procedure of J.V. Maizel Jr., Meth. Virol., 5, 180 (1971).
The dC tailed cDNA fragments were annealed to pBR322 that had
been cleaved with the restriction endonuclease PstI and tailed
with deoxyguanidylic acid using terminal deoxynucleotidyl
transferase. The resulting plasmids were transformed into
E. coli K-12 strain 294 and cloned. Colonies resistant to
tetracycline but sensitive to ampicillin were isolated and
screened for plasmids having three sites cleavable by the
restriction endonuclease PstI, indicative of the presence of the
gene for insulin. Sures et al., Science, 208, 57 (1980).

One plasmid, pHI104, containing a 600 base pair insert
and giving the anticipated PstI restriction pattern was
determined to contain a site cleavable by BamHI between the 3'
polyA and the polyGC introduced during the cDNA preparation.
Some of the nucleotide sequence of the insert is shown in
Fig. 1. This sequence differs slightly from that previously
reported by I. Sures et al., Science, 208, 57 (1980) and G.
Bell et al., Nature, 282, 525 (1979), having an AT base pair
where underlined rather than a GC pair, because the mRNA used
was from tissue isolated from a different individual. The
resistance to antibiotics conferred on a bacterium by this
plasmid is indicated by the marker Ap^s for ampicillin
sensitivity and Tc^r for tetracycline resistance.

3. Assembly Of A Gene Coding For Human Proinsulin

The scheme used for assembling a gene coding for human
proinsulin is shown in Fig. 3.

The synthetic gene segment coding for the first 31
amino acids of proinsulin, fragment 1 in Fig. 3, was recovered
from 50 µg of the plasmid pIB3 using the restriction
endonucleases EcoRI and HpaII as described above. This fragment
also contains the codon ATG for methionine in place of the
"presequence" of preproinsulin. Introduction of a methionine
unit at this point permits the polypeptide ultimately expressed
to be cleaved at this point by cyanogen bromide (CNBr) to
separate the proinsulin from the residue of the polypeptide
which served to protect the proinsulin portion from bacterial
proteases.

The cDNA gene segment coding for amino acids 32-85, as
well as the translation stop codons and the 3' untranslated
region of the mRNA, was recovered from 40 µg of the plasmid
pHI104 by treatment first with BamHI and then HpaII as shown in
Fig. 3 as fragment 2.

The two fragments were isolated by polyacrylamide
electrophoresis followed by electroelution. The gene fragments
were joined by treatment with T4 DNA ligase in 20 µl ligase
buffer (Goeddel et al., Proc. Nat. Acad. Sci., USA, 76, 106
(1979)) at 4°C for 24 hrs. The mixture was diluted with 50 µl
H2O, extracted with phenol, then chloroform and then
precipitated with ethanol.

The resulting DNA was treated with BamHI and EcoRI to
regenerate these sites and remove gene polymers. The assembled
proinsulin gene was isolated by polyacrylamide gel
electrophoresis and ligated using T4 ligase to the plasmid pBR322
which had previously been treated with EcoRI and BamHI. The
resulting DNA was transformed into E. coli K-12 strain 294 and
cloned. Colonies were screened using the plasmid conferred
antibiotic resistance markers. The desired clones were
tetracycline-sensitive (Tc^s) and ampicillin-resistant (Ap^r).
Plasmid pHI3 was isolated from one such colony and the
proinsulin gene was characterized by nucleotide sequence analysis
and found to have the sequence shown in Fig. 1.
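
The marker-based screening used in the cloning steps above follows from which pBR322 resistance gene a given insertion interrupts. The following sketch records the commonly described relationships (the PstI site lies within the ampicillin resistance gene; replacing the HindIII-BamHI or EcoRI-BamHI segment disrupts tetracycline resistance). The table and function names are assumptions introduced for illustration only.

```python
# Resistance marker interrupted when an insert is cloned at each pBR322 site.
# None means the site itself lies outside both resistance genes.
DISRUPTED_MARKER = {
    "PstI": "Ap",
    "BamHI": "Tc",
    "HindIII": "Tc",
    "EcoRI": None,
}


def expected_phenotype(cloning_sites):
    """Return the expected Ap/Tc phenotype ('r' or 's') of a recombinant."""
    phenotype = {"Ap": "r", "Tc": "r"}
    for site in cloning_sites:
        marker = DISRUPTED_MARKER[site]
        if marker is not None:
            phenotype[marker] = "s"
    return phenotype


# Assembled proinsulin gene cloned between EcoRI and BamHI.
print(expected_phenotype(["EcoRI", "BamHI"]))
# dC-tailed cDNA annealed into the dG-tailed PstI site.
print(expected_phenotype(["PstI"]))
```

The two cases reproduce the phenotypes recited in the text: tetracycline-sensitive, ampicillin-resistant clones for the EcoRI-BamHI construct, and tetracycline-resistant, ampicillin-sensitive clones for the PstI insertion.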

4. Construction of a Plasmid Designed to Express a
Chimeric Protein Containing the Human Proinsulin Peptide

Plasmid pBRH1 (R.I. Rodriguez et al., Nucleic Acids
Research 6, 3267-3287 (1979)) expresses ampicillin resistance and
contains the gene for tetracycline resistance but, there being
no associated promoter, does not express that resistance. The
plasmid is accordingly tetracycline sensitive. By introducing
a promoter-operator system in the EcoRI site, the plasmid can
be made tetracycline resistant.

Plasmid pGM1 carries the E. coli tryptophan operon
containing the deletion ΔLE1413 (G.F. Miozzari et al., (1978)
J. Bacteriology 1457-1466) and hence expresses a fusion
protein comprising the first 6 amino acids of the trp leader
and approximately the last third of the trp E polypeptide
(hereinafter referred to in conjunction as LE'), as well as the
trp D polypeptide in its entirety, all under the control of the
trp promoter-operator system. The plasmid, 20 μg, was digested
with the restriction enzyme PvuII which cleaves the plasmid at
five sites. The gene fragments 2 were next combined with EcoRI
linkers (consisting of a self-complementary oligonucleotide 3
of the sequence: pCATGAATTCATG) providing an EcoRI cleavage
site for a later cloning into a plasmid containing an EcoRI
site. The 20 μg of DNA fragments 2 obtained from pGM1 were
treated with 10 units of T4 DNA ligase in the presence of 200
picomoles of the 5'-phosphorylated synthetic oligonucleotide
pCATGAATTCATG and in 20 μl T4 DNA ligase buffer (20 mM tris,
pH 7.6, 0.5 mM ATP, 10 mM MgCl2, 5 mM dithiothreitol) at 4°C
overnight. The solution was then heated 10 minutes at 70°C to
halt ligation. The linkers were cleaved by EcoRI digestion and
the fragments, now with EcoRI ends, were separated using 5
percent polyacrylamide gel electrophoresis (hereinafter
"PAGE") and the three largest fragments isolated from the gel
by first staining with ethidium bromide, locating the fragments
with ultraviolet light, and cutting from the gel the portions
of interest. Each gel fragment, with 300 microliters 0.1xTBE,
was placed in a dialysis bag and subjected to electrophoresis
at 100 V for one hour in 0.1xTBE buffer (TBE buffer contains:
10.8 g tris base, 5.5 g boric acid, 0.09 g Na2EDTA in 1
liter H2O). The aqueous solution was collected from the
dialysis bag, phenol extracted, chloroform extracted and made
0.2 M NaCl, and the DNA recovered in H2O after EtOH
precipitation.
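The self-complementary property of the linker described above can be checked with a short sketch. The sequence CATGAATTCATG is taken from the text; the EcoRI recognition sequence GAATTC is standard knowledge rather than something stated in this passage, and the function and variable names are illustrative only.

```python
# Sketch: verify that the EcoRI linker from the text is self-complementary
# and carries an EcoRI recognition site.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


LINKER = "CATGAATTCATG"   # from the text (the 5' phosphate "p" is omitted)
ECORI_SITE = "GAATTC"     # assumption: standard EcoRI recognition sequence

# A self-complementary oligonucleotide equals its own reverse complement,
# so two copies can anneal to form a duplex linker.
is_self_complementary = reverse_complement(LINKER) == LINKER

# Position of the EcoRI site inside the linker (0-based).
site_offset = LINKER.find(ECORI_SITE)

print(is_self_complementary, site_offset)
```

Because the oligonucleotide pairs with itself, a single synthetic species suffices to build the double-stranded linker that is later cut by EcoRI.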

pBRH1 was digested with EcoRI and the enzyme removed by
phenol extraction followed by chloroform extraction and
recovered in water after ethanol precipitation. The resulting
DNA molecule was, in separate reaction mixtures, combined with
each of the three DNA fragments obtained above and ligated with
T4 DNA ligase as previously described. The DNA present in
the reaction mixture was used to transform competent E. coli
K-12 strain 294 (K. Backman et al., Proc Nat'l Acad Sci USA 73,
4174-4198 [1976]) (ATCC no. 31446) by standard techniques (V.
Hershfield et al., Proc Nat'l Acad Sci USA 71, 3455-3459
[1974]) and the bacteria plated on LB plates containing
20 μg/ml ampicillin and 5 μg/ml tetracycline. Several
tetracycline-resistant colonies were selected, plasmid DNA
isolated and the presence of the desired fragment confirmed by
restriction enzyme analysis. The resulting plasmid was
designated pBRHtrp.

pBRHtrp was digested with EcoRI restriction enzyme and
the resulting fragment isolated by PAGE and electroelution.
EcoRI-digested plasmid pSOM11 (K. Itakura et al., Science 198,
1056 (1977); G.B. patent publication no. 2 007 676 A) was
combined with this fragment. The mixture was ligated with T4
DNA ligase as previously described and the resulting DNA
transformed into E. coli strain 294 as previously
described. Transformant bacteria were selected on
ampicillin-containing plates. Resulting ampicillin-resistant
colonies were screened by colony hybridization (M. Gruenstein
et al., Proc Nat'l Acad Sci USA 72, 3951-3965 [1975]) using as
a probe the trp promoter-operator-containing fragment isolated
from pBRHtrp, which had been radioactively labelled with
P32. Several colonies shown positive by colony hybridization
were selected, plasmid DNA was isolated and the orientation of
the inserted fragments determined by restriction analysis
employing restriction enzymes BglII and BamHI in double
digestion. E. coli 294 containing the plasmid designated pSOM7Δ2.

Plasmid pBR322 was HindIII digested and the protruding
HindIII ends in turn digested with S1 nuclease. The S1
nuclease digestion involved treatment of 10 μg of
HindIII-cleaved pBR322 in 30 μl S1 buffer (0.3 M NaCl, 1 mM
ZnCl2, 25 mM sodium acetate, pH 4.5) with 300 units S1
nuclease for 30 minutes at 15°C. The reaction was stopped by
the addition of 1 μl of 30 X S1 nuclease stop solution (0.8 M
tris base, 50 mM EDTA). The mixture was phenol extracted,
chloroform extracted and ethanol precipitated, then EcoRI
digested as previously described and the large fragment 1'
obtained by PAGE procedure followed by electroelution. The
fragment obtained has a first EcoRI sticky end and a second,
blunt end whose coding strand begins with the nucleotide
thymidine.

16 μg plasmid pSOM7Δ2 was diluted into 200 μl of buffer
containing 20 mM Tris, pH 7.5, 5 mM MgCl2, 0.02 percent NP40
detergent, 100 mM NaCl and treated with 0.5 units EcoRI. After
15 minutes at 37°C, the reaction mixture was phenol extracted,
chloroform extracted and ethanol precipitated and subsequently
digested with Bgl II. The larger resulting fragment 3' was
isolated by the PAGE procedure followed by electroelution.
This fragment contains the codons "LE'(p)" for the proximal end
of the LE' polypeptide, i.e., those upstream from the Bgl II
site. Fragment 4' was prepared by successive digestion of pTrp24
[...] (prepared upon EcoRI digest of pThα1 (Biochemistry [...]
(1980)) followed by Klenow polymerase reaction to blunt the
EcoRI residues; Bgl II digestion creates a linear fragment which
was recircularized by reaction with the LE'-containing Bgl II
sticky end [...]) followed by PAGE and electroelution. The
ligation was done in the presence of T4 DNA ligase to form the
plasmid pSOM7Δ2Δ4, which was transformed into E. coli strain
294, as previously described.

Plasmid pSOM7Δ2 was Bgl II digested and the Bgl II
sticky ends resulting made double stranded with the Klenow
polymerase I procedure using all four deoxynucleotide
triphosphates. EcoRI cleavage of the resulting product
followed by PAGE and electroelution of the small fragment 2'
yielded a linear piece of DNA containing the tryptophan
promoter-operator and codons of the LE' "proximal" sequence
upstream from the Bgl II site ("LE'(p)"). The product had an
EcoRI end and a blunt end resulting from filling in the Bgl II
site. However, the Bgl II site is reconstituted by ligation of
the blunt end of the fragment 1' to the blunt end of fragment
2'. Thus the two fragments were ligated in the presence of
T4 DNA ligase to form the recircularized plasmid pHKY10
which was propagated by transformation into competent E. coli
strain 294 cells. Tetracycline resistant cells bearing the
recombinant plasmid pHKY10 were grown up, plasmid DNA
extracted and digested in turn with Bgl II and Pst, followed by
isolation by the PAGE procedure and electroelution of the large
fragment, a linear piece of DNA having Pst and Bgl II sticky
ends, to give DNA fragment 7'.

Plasmid pSOM7Δ2Δ4 could be manipulated to provide a
second component for a system capable of receiving a wide
variety of heterologous structural genes. The plasmid was
subjected to partial EcoRI digestion followed by Pst digestion
and the fragment containing the trp promoter/operator was
isolated by the PAGE procedure followed by electroelution.
Partial EcoRI digestion was necessary to obtain a fragment
which was cleaved adjacent to the 5' end of the somatostatin
gene but not cleaved at the EcoRI site present between the
ampicillin resistance gene and the trp promoter operator.
Ampicillin resistance lost by the Pst I cut in the apR gene
could be restored upon ligation with fragment 5'.

In a first demonstration the third component, a
structural gene for somatostatin (6'), was obtained and purified
by PAGE and electroelution.

The three gene fragments 7', 5' and 6' could now be
ligated together in proper orientation, to form the plasmid
pSOM7Δ2Δ4.

The complete human proinsulin gene, including the
N-terminal codons that code for methionine, was recovered from
the plasmid pHI3 by treatment with EcoRI and BamHI and purified
by gel electrophoresis. This gene, fragment 3 in Fig. 3, was
joined to two other DNA fragments with T4 ligase; these are
identified as fragments 4 and 5 in Fig. 3.

Fragment 4 contains a promoter and a carrier protein
gene derived from the plasmid pSOM7Δ2Δ4 by partial digestion
with EcoRI and complete digestion with PstI. This fragment
contains an E. coli tryptophan (trp) promoter-operator, nine
codons from the trp leader peptide, 190 codons from the trp E
gene, and an EcoRI cleavage site introduced in place of the trp
E termination codon. (This gene construction will be referred
to as trp LE' below.) The tryptophan attenuator region
including the last 5 codons of the trp leader peptide sequence
and the first two thirds of the trp E gene are deleted in this
construction.

The trp E gene (trp LE'), contained in Fragment 4, is
modified to incorporate an EcoRI site in place of the
termination codon of the trp E gene as shown to give the
correct reading frame with the inserted gene fragment 3.

This fragment is bounded at the opposite end by a
PstI site derived from pBR322 and incorporates the first
half of the β-lactamase gene. The fragment was recovered from
20 μg of plasmid pSOM7Δ2Δ4 by partial digestion with EcoRI
followed by treatment with PstI. The promoter-containing
fragments were isolated by polyacrylamide gel electrophoresis.

Fragment 5 was obtained from plasmid pHKY10. This
plasmid is a derivative of pBR322 and contains a tryptophan
promoter-operator in place of the tetracycline promoter. The
HindIII site of pBR322 has been converted to a BglII site. The
plasmid, 20 μg, was treated with PstI and BglII and the large
fragment, designated 5 in Fig. 3, purified by polyacrylamide
gel electrophoresis.

The two fragments 4 and 5 were ligated together to
reassemble the gene for β-lactamase via a PstI site and confer
ampicillin resistance (Apr). The ends then present an EcoRI
site and a BglII site for insertion of a gene. These two sites
cannot be ligated together due to nonhybridization of the 3'
protruding ends and can only be joined by incorporating a DNA
fragment that possesses 3' ends complementary to the EcoRI and
BglII ends. The proinsulin gene, fragment 3 containing EcoRI
and BamHI ends, is such a molecule. Thus, the three fragments,
5 μg of 4, 1 μg of 3 and 1 μg of 5, were combined and treated
with T4 DNA ligase at 4°C for 24 hours in ligase buffer. Upon
circularization to close the plasmid, the tryptophan promoter-
operator controls expression of a fusion (chimeric) protein of
which proinsulin is a portion. Tetracycline resistance (Tcr)
is also conferred.
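The end-compatibility argument above can be illustrated with a minimal sketch. The recognition sequences and cut positions used here (G^AATTC for EcoRI, A^GATCT for BglII, G^GATCC for BamHI) are standard enzyme data and are an assumption brought in for illustration; they are not given in the text. The sketch only compares the single-stranded overhang sequences.

```python
# Sketch: compare the sticky-end overhangs of the enzymes named in the text.
# Each entry is (recognition site 5'->3', cut offset within the site).
# These values are assumptions taken from standard enzyme data.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),   # G^AATTC
    "BglII": ("AGATCT", 1),   # A^GATCT
    "BamHI": ("GGATCC", 1),   # G^GATCC
}


def overhang(enzyme: str) -> str:
    """Single-stranded overhang left by a staggered cut in a palindromic site."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(a: str, b: str) -> bool:
    """Two ends can anneal when their overhangs match.

    The overhangs of these enzymes are themselves palindromic, so comparing
    the sequences directly is equivalent to a reverse-complement check.
    """
    return overhang(a) == overhang(b)


for pair in [("EcoRI", "BglII"), ("BglII", "BamHI"), ("EcoRI", "EcoRI")]:
    print(pair, overhang(pair[0]), overhang(pair[1]), compatible(*pair))
```

Under these assumptions the EcoRI and BglII ends of the joined fragments 4 and 5 cannot close on each other, while a fragment with one EcoRI end and one BamHI end, such as fragment 3, can bridge them.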
[...] were found by SDS polyacrylamide gel electrophoresis
(J.V. Maizel, Jr. et al., Meth. Virol. 5, 180 (1971)) to express a
protein of the weight expected of the trp LE'-proinsulin
fusion. One plasmid, pHI7, was completely characterized
as to DNA sequence of the incorporated gene and restriction
analysis of the vector pBR322.

5. Proinsulin Isolation

The plasmid pHI7 was transformed into E. coli K-12
strain RV308 (ATCC No. 31608) [...] 500 ml [...]
containing [...] μg/ml ampicillin [...]
to a cell density of 14 OD. Cells were collected by
centrifugation and frozen.

[...] buffer (10 percent sucrose, [...] EDTA [...]).
The pellet was collected by centrifugation and suspended by
stirring overnight at 4°C with 4 volumes of 7.0 M guanidine-HCl,
[...] EDTA, 1 mM dimercaptopropanol, [...]. After
centrifugation the supernatant was diluted [...] cold water and
allowed [...]. The precipitate (5.5 g) was collected
by centrifugation and reacted overnight at room temperature in
[...] proinsulin from the trp LE' fusion. After rotary evaporation
[...] ammonium carbonate, and the pH adjusted to [...]
[...] sodium tetrathionate [...]
and cysteines to S-sulfonate [...] reaction stirred at
room temperature for 6 hours.

The reaction mixture was desalted on [...]
column [...], 10 mM EDTA. The desalted
protein was loaded onto a DEAE-Sephadex (A-25) column and
eluted with a 1-liter linear gradient of 0 to 0.5 M NaCl in the
same [...] buffer. The proinsulin-like material,
identified by RIA or HPLC (see below), was concentrated on an
Amicon [...] membrane and resolved in the tris-[...]
G-50 Medium column. The G-50 fractions, identified by HPLC,
were pooled (104 ml) and the buffer changed on a column of G-25
Fine equilibrated with 30 mM ammonium carbonate, pH 8.8. The
lyophilized protein weighed [...]. The recoveries at each
step are shown in Table 2.
6. Proinsulin Analysis

The S-sulfonated proinsulin obtained was analyzed by
amino acid analysis. This analysis was made by Eli Lilly &
Co. and is shown in Table 3.



TABLE 3

         Amino Acids   Amino Acids            Amino Acids   Amino Acids
         Calculated    Predicted              Calculated    Predicted

Asp        4.40          4           Ile        1.34          2
Thr        2.90          3           Leu       12.21         12
Ser        4.50          5           Tyr        3.93          4
Glu       15.64         15           Phe        2.61          3
Pro        3.42          3           His        2.02          2
Gly       11.08         11           Lys        1.96          2
Ala        4.46          4           Arg        3.92          4
Cys        2.85          6           Val        5.58          6
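As a reading aid for Table 3, the agreement between the calculated and predicted residue counts can be summarised with a short sketch. The numbers are copied from the table; the relative-deviation measure and all names are illustrative choices, not part of the original analysis.

```python
# Sketch: compare the "Calculated" and "Predicted" columns of Table 3.
# Each tuple is (residue, calculated, predicted), values as listed in the table.
TABLE_3 = [
    ("Asp", 4.40, 4), ("Thr", 2.90, 3), ("Ser", 4.50, 5), ("Glu", 15.64, 15),
    ("Pro", 3.42, 3), ("Gly", 11.08, 11), ("Ala", 4.46, 4), ("Cys", 2.85, 6),
    ("Val", 5.58, 6), ("Ile", 1.34, 2), ("Leu", 12.21, 12), ("Tyr", 3.93, 4),
    ("Phe", 2.61, 3), ("His", 2.02, 2), ("Lys", 1.96, 2), ("Arg", 3.92, 4),
]

# Total number of residues implied by the "Predicted" column.
total_predicted = sum(predicted for _, _, predicted in TABLE_3)


def relative_deviation(calculated: float, predicted: int) -> float:
    """Signed deviation of the calculated value as a fraction of the prediction."""
    return (calculated - predicted) / predicted


deviations = {name: relative_deviation(c, p) for name, c, p in TABLE_3}

# Residue whose calculated value departs most from the prediction.
worst = max(deviations, key=lambda name: abs(deviations[name]))

print(total_predicted, worst, round(deviations[worst], 3))
```

The predicted column sums to the full length of the proinsulin chain, and the sketch singles out the one residue whose calculated value is furthest from its prediction.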
The presence of proinsulin was also confirmed by radio-
immunoassay. To determine the radioimmunoactivity the Corning
125I-insulin kit was used. The antibody was found to be about
4 percent cross-reactive with proinsulin and about 0.2 percent
cross-reactive with reduced proinsulin. Unknowns were heated 2
minutes at 90°C, in 7.5 M urea, 2 mM β-mercaptoethanol, pH 8-10
(ethanolamine), and aliquots diluted in phosphate-buffered
saline (0.1 percent gelatin) and immediately assayed. These results
were determined from comparisons to a reduced proinsulin
standard curve generated in the same way from either bovine
proinsulin or human proinsulin S-sulfonate.

The proinsulin-S-sulfonate was also assayed by HPLC. In
Fig. 4 profiles are shown for S-sulfonated bovine proinsulin,
bacterial derived human proinsulin and a combination of the
two. In this analysis, samples of bovine and human proinsulin
sulfonate and a mixture of the two were applied to a 10 μ
RP-18 column and eluted using a linear gradient of 21 to 33
percent n-propanol and acetonitrile (2:1) in 50 mM NH4OAc
(pH 7). The proteins are seen to run very nearly coincident.
The large A 2?8 peak at the end of the chromatogram is due to
rapid changes in the solvent composition and not eluted protein.

B. Preparation of Proinsulin Analog 

Described below is the synthesis of a gene which codes for
the expression of an analog of proinsulin comprising the A and
B chains of human insulin connected by a bridging chain which
differs from the C chain of human proinsulin in that it
contains only 6 amino acid units rather than the 35 unit
polypeptide of human proinsulin shown in Fig. 1. Specifically
the 6 units are, reading in order from the last unit of the B
chain to the A chain, Arg-Arg-Gly-Ser-Lys-Arg. This sequence
has the same end sequences Arg-Arg and Lys-Arg as does human
proinsulin, thus permitting excision of the bridging chain by
proteolytic means.
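The arrangement just described (B chain, six-unit bridge, A chain) can be written out as a minimal sketch. The bridge sequence is taken from the text. The one-letter A and B chain sequences are the standard published human insulin sequences and are an assumption here, since this passage does not list them; the variable names are illustrative.

```python
# Sketch: assemble the analog proinsulin as B chain + bridge + A chain.
# Assumption: standard human insulin chain sequences (not listed in the text).
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"   # 30 residues
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"            # 21 residues

# Bridging chain as given in the text, converted to one-letter codes.
THREE_TO_ONE = {"Arg": "R", "Gly": "G", "Ser": "S", "Lys": "K"}
BRIDGE = "".join(THREE_TO_ONE[unit] for unit in "Arg-Arg-Gly-Ser-Lys-Arg".split("-"))

analog = B_CHAIN + BRIDGE + A_CHAIN

# The native bridging chain has 35 units according to the text.
NATIVE_BRIDGE_LENGTH = 35
units_removed = NATIVE_BRIDGE_LENGTH - len(BRIDGE)

print(BRIDGE, len(analog), units_removed)
```

With these sequences the analog comes to 57 residues, which agrees with the "57 Amino Acids" named in the heading of the gene-synthesis section that follows.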

A chain of 6 amino acids is an acceptable length for a
modified (or analog) bridging "C" chain which will permit
folding and the subsequent formation of the disulfide
crosslinks between A and B chains characteristic of hormone
insulin. However, those skilled in the art will appreciate
that bridging chains shorter or longer than 6 would also be
useful by permitting folding and the formation of the
necessary disulfide bonds. Sequences of 100 or even more amino
acid units can be employed in the bridging chain. However, the
practical difficulty of obtaining gene fragments coding for
very long sequences makes the bridging chain analogs of fewer
than 35 amino acid units more attractive from a practical point
of view.

The ends of the bridging chain, no matter how many
intermediate amino acid units or in what order, must be
constructed to permit excision of the bridging chain.
Although alternative means may be employed, we prefer to use
the sequences Arg-Arg and Lys-Arg as found in proinsulin itself
as proteolytic cleavage using trypsin and carboxypeptidase B
occurs cleanly at these sites.
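The two-enzyme excision described above can be modelled in a deliberately idealized way. In this sketch the trypsin step cuts only after a pair of basic residues and the carboxypeptidase B step strips basic residues from a C-terminus; real trypsin can also cut after single Arg or Lys residues, which this model ignores. The chain sequences are the standard human insulin sequences (an assumption, not listed in the text) and the function names are hypothetical.

```python
import re

# Assumption: standard human insulin chain sequences.
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"
BRIDGE = "RRGSKR"   # Arg-Arg-Gly-Ser-Lys-Arg, from the text
ANALOG = B_CHAIN + BRIDGE + A_CHAIN


def cleave_after_dibasic(seq: str) -> list:
    """Idealized trypsin step: cut on the C-terminal side of each basic pair."""
    pieces, start = [], 0
    for match in re.finditer(r"[RK][RK]", seq):
        pieces.append(seq[start:match.end()])
        start = match.end()
    pieces.append(seq[start:])
    return pieces


def trim_c_terminal_basic(seq: str) -> str:
    """Idealized carboxypeptidase B step: remove C-terminal Arg/Lys residues."""
    return seq.rstrip("RK")


pieces = cleave_after_dibasic(ANALOG)
trimmed = [trim_c_terminal_basic(piece) for piece in pieces]
b_product, a_product = trimmed[0], trimmed[-1]

print(pieces)
print(b_product == B_CHAIN, a_product == A_CHAIN)
```

In this simplified model the Arg-Arg and Lys-Arg ends are what allow both chains to be recovered with their native termini; in the folded molecule the two chains would remain joined by the disulfide bonds formed beforehand.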

1. Preparation of Synthetic Gene Coding for the 57 Amino
Acids of an Analog Proinsulin

a. Oligonucleotide Synthesis 

The chemical synthesis methods as well as the
synthesis of the DNA gene fragments coding for the A and B
chains of human insulin have been described. K. Itakura
et al., J. Biol. Chem. 250, 4592 (1975), Itakura et al.,
J. Am. Chem. Soc., 97, 7327 (1975), Crea et al., Proc. Nat.
Acad. Sci., USA, 75, 5765 (1978) and Goeddel et al., Proc. Nat.
Acad. Sci., USA, [...] references. [...] are
shown below in Table 4 and Fig. 5.



TABLE 4

Oligonucleotides



C 2 



C 3 



AAGACTCGTOOGIG 

GAKCAAGCSJ1GGCATC 

CATCCACGACHAGTCTT 

CAACGATGQJTOCGOTTG 

TCGACTAT7WCTT 



b. Joining of Synthetic Oligonucleotides

Figure 5 shows the synthetic oligonucleotides [...]
insulin A and B chain genes previously prepared [...]
enzymatic construction of [...]
analog. The [...] for obtaining this gene is set [...].

A plasmid pBH1 containing the left half of the B
chain gene was [...] and is described by Goeddel
[...] containing the codons for the 1-13 amino acids of the
B chain [...] a methionine at the N-terminal which will [...]
chimeric protein using cyanogen bromide (CNBr).

The gene for the right half of the B chain was
obtained from the oligonucleotides B[...], B[...], [...],
B6 and B7, B8, B[...] and C[...] (C[...] replacing the
B5 and B10 sequences in the previously prepared gene frag-
ment) by ligation using T4 ligase in conventional techniques.
This gene fragment codes for the 14-30 amino acid units of the
B chain and the first two units Arg-Arg of the bridging
(modified "C") chain-pBC1.

After purification by polyacrylamide gel electro-
phoresis and elution of the largest DNA band the fragment was
inserted into the plasmid pBR322 that had been cleaved with
restriction endonucleases HindIII and BamHI, thereby utilizing
the HindIII and BamHI sites on the gene fragment, and cloned in
E. coli 294 (ATCC No. 31446). The plasmid pBC1 recovered from
an ampicillin-resistant, tetracycline-sensitive clone possessed
the desired nucleotide sequence according to the method of A.M.
Maxam et al., Proc. Nat. Acad. Sci., USA, 74, 560 (1977).

The A gene was constructed similarly from
oligonucleotides C2, A2, A3, A4, A5, A6, C4,
A8, A9, A10, A11 and C5 (C2, C4 and C5
replacing the A1, A7 and A12 sequences in the previously
prepared gene fragment) using T4 ligase in conventional
techniques. This fragment codes for the 21 amino acids of the
A chain and the four units of the bridging C chain, Gly-Ser-Lys-Arg.
After purification, also by polyacrylamide electrophoresis, the
fragment was inserted into the plasmid pBR322 which had been
cleaved with restriction endonucleases EcoRI and SalI using the
EcoRI and SalI sites on the fragment and cloned in E. coli
294. An ampicillin-resistant, tetracycline-sensitive clone
yielded plasmid pCA1B having the desired nucleotide sequence by
the method of A.M. Maxam et al., supra.

2. Construction of the Proinsulin Analog Gene and 
Corresponding Expression Plasmid 

The desired expression plasmid was prepared from
plasmids pBH1, pBC1 and pCA1B as shown in Fig. 6. The plasmid
pBH1 was cleaved with HindIII and ligated to the fragment BC
excised from plasmid pBC1 by treatment with HindIII and BamHI.
The resulting plasmid [...] was [...] ligated
to an EcoRI fragment of λplac5 which contains the lac control
region and the majority of the β-galactosidase structural gene
(designated Z). K. Itakura et al., Science, 198, 1056 (1977).
This ligation produced plasmid pt8254 which was cleaved with
BamHI, SalI and alkaline phosphatase. The product of this
cleavage was ligated to fragment CA excised from plasmid pCA1B
by treatment with BamHI and SalI as shown in Fig. 6 and trans-
formed into E. coli 294 and cloned. The plasmid pBCA5 was
recovered from ampicillin-resistant, tetracycline-sensitive
clones in E. coli 294 grown on X-gal plates containing
ampicillin and contained the DNA coding sequence for the
proinsulin analog as indicated by the method of A.M. Maxam
et al., supra.

3. Expression of Proinsulin Analog 

The fully characterized plasmid pBCA5 was inserted into
E. coli RV308 and grown in four-liter flasks containing 1.5
liter LB containing 20 mg/l ampicillin. Recovered cells (322 g
wet) were lysed by sonication in two liters of 10 percent
sucrose, 50 mM EDTA, 0.1 M tris/HCl, pH 7.9, 0.1 M
phenylmethylsulfonylfluoride, 0.2 M NaCl, and 1 mM
1,3-dithio-2-propanol. After centrifugation (30 min, [...] rpm)
the pellet was suspended [...] guanidine
hydrochloride, 0.1 mM 1,3-dithio-2-propanol, 1 mM EDTA. This
suspension was centrifuged (30 min, 12,000 rpm) and the
supernatant diluted six-fold into cold water. The
precipitated protein (12.4 g dry weight) was collected by
centrifugation (20 min, 5000 rpm) and treated overnight at RT
with 2.8 g (26.4 mmol) CNBr in 200 ml 88 percent formic acid to
cleave the proinsulin analog (hereinafter "analog-'C'-
proinsulin") from the [...]
N-methionine unit. After rotary evaporation to dryness at
under 30°C water was added and the volume again reduced. The
residue was suspended in 300 ml 6 M guanidine hydrochloride and
the pH adjusted to [...]. To this [...]
15.5 g sodium sulfite and 7.5 g sodium tetrathionate to convert
[...] cysteines to cysteine S-sulfonate groups and the
mixture allowed to react at room temperature for [...] hours.
The reaction [...] exhaustively dialyzed (Spectrapor 3)
[...] 1 mM EDTA at 4°C and the precipitated protein
S-sulfonates collected by centrifugation (20 min, 5000 rpm).
The [...] was suspended in 50 ml [...] Tris/HCl, pH 7.5,
filtered and loaded onto a DE-52 column (2.5 x 87 cm) in the
same buffer at 4°C. The column was eluted with a linear
gradient of 0 to 0.5 M NaCl in the same buffer. The presence
of analog-"C" proinsulin S-sulfonate was detected by insulin RIA
analyses described below to elute at the end of the OD peak.
The pooled fractions were applied to a G-50 [...] Sephadex column
(2.5 x 100 cm) and eluted with 7 M urea, 50 mM tris/HCl, pH
7.5, 1 mM EDTA. Fractions containing insulin-RIA-active
material were pooled and dialyzed against 20 mM ammonium
carbonate, pH 8.8, and lyophilized. The white powder was
resuspended in 7 ml 20 mM ammonium carbonate, pH 8.8, and
stored at -40°C. A portion of the product pool was further
purified by preparative HPLC and the insulin RIA-active peak
analyzed for amino-acid content which is shown in Table 5.
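The logic of the cyanogen bromide step can be sketched as follows: CNBr cleaves on the C-terminal side of methionine, so a single Met placed directly before the analog releases it intact, provided the analog itself contains no Met. The carrier sequence below is purely hypothetical and stands in for the real carrier protein, which is not spelled out here; the chain sequences are the standard human insulin sequences (an assumption), and the function name is illustrative.

```python
# Assumption: standard human insulin chain sequences joined by the
# six-unit bridge Arg-Arg-Gly-Ser-Lys-Arg named in the text.
ANALOG = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT" + "RRGSKR" + "GIVEQCCTSICSLYQLENYCN"

# Hypothetical carrier fragment, for illustration only.
CARRIER = "TLEGAMSDQ"


def cnbr_cleave(seq: str) -> list:
    """Split a sequence after every methionine (idealized CNBr cleavage).

    The "M" is kept at the end of each fragment as a placeholder; chemically
    the residue is converted during cleavage, which this sketch does not model.
    """
    fragments, start = [], 0
    for index, residue in enumerate(seq):
        if residue == "M":
            fragments.append(seq[start:index + 1])
            start = index + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


chimera = CARRIER + "M" + ANALOG   # carrier, cleavage-site Met, then the analog
fragments = cnbr_cleave(chimera)

print(fragments)
```

The carrier may be cut into several pieces at its own methionines, but the last fragment is the unbroken analog, which is the reason the cleavage site is placed adjacent to the N-terminal of the proinsulin sequence.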

The analog-"C" proinsulin S-sulfonate was purified by
HPLC by prep-collection from an analytical C-8 column. The
resolving gradient was 11-39 percent 2-propanol in 50 mM NaOAc,
pH 7.0, at 0.6 percent/min and 2 ml/min. Up to 1.5 ml of
protein solution was resolved in one run by three successive
injections via a 500 μl loop.
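The run parameters quoted above imply a gradient duration and solvent volume that can be worked out with a small sketch. The figures are those given in the text (the flow rate is read as 2 ml/min); the function name and the assumption of a strictly linear gradient with no hold or re-equilibration time are illustrative.

```python
import math


def gradient_run(start_pct: float, end_pct: float,
                 slope_pct_per_min: float, flow_ml_per_min: float):
    """Return (minutes, ml of eluent) for a linear gradient segment."""
    minutes = (end_pct - start_pct) / slope_pct_per_min
    return minutes, minutes * flow_ml_per_min


# 11-39 percent 2-propanol at 0.6 percent/min and 2 ml/min.
minutes, volume_ml = gradient_run(11, 39, 0.6, 2.0)

# 1.5 ml of sample loaded through a 500 microliter loop.
injections = math.ceil(1500 / 500)

print(round(minutes, 1), round(volume_ml, 1), injections)
```

Under these assumptions the gradient itself takes a little under 47 minutes, and loading 1.5 ml through a 500 μl loop requires the three successive injections mentioned in the text.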
TABLE 5

Amino Acid Analysis of 24 hour 6N HCl hydrolysis (110°) of
purified analog-"C" proinsulin. Cysteine was quantified by
separate determination of cysteic acid on performic acid
oxidized sample and calculated by cysteic/alanine ratio.
Values increased to compensate for acid decomposition were
serine (10 percent) and threonine (5 percent). N.D. = not
determined.

Amino Acid    aa mole percent x 57    amino acids predicted

Ala            1.19                    1
Arg            4.06                    4
Asp            3.17                    3
Cys/2          4.87                    6
Glu            6.95                    7
Gly            5.37                    5
His            2.03                    2
Ile            1.39
Leu            5.95
Lys            2.02
Met            0
Phe            3.03
Pro            1.96
Ser            4.21
Thr            3.09
Trp            N.D.
Tyr            3.99
Val            3.74
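The predicted residue counts in Table 5 can be cross-checked against the sequence of the analog. The sketch below assumes the standard human insulin A and B chain sequences (not listed in this document) joined by the bridge Arg-Arg-Gly-Ser-Lys-Arg, and pools Asn with Asp and Gln with Glu because acid hydrolysis converts the amides to the acids. Only the predicted values legible in the table are compared.

```python
from collections import Counter

# Assumption: standard human insulin chain sequences.
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"
ANALOG = B_CHAIN + "RRGSKR" + A_CHAIN

counts = Counter(ANALOG)

# Composition as it would appear after 6N HCl hydrolysis:
# Asn is reported with Asp, and Gln with Glu.
predicted = {
    "Ala": counts["A"],
    "Arg": counts["R"],
    "Asp": counts["D"] + counts["N"],
    "Cys": counts["C"],
    "Glu": counts["E"] + counts["Q"],
    "Gly": counts["G"],
    "His": counts["H"],
}

# Predicted values as far as they are listed in Table 5.
TABLE_5_PREDICTED = {"Ala": 1, "Arg": 4, "Asp": 3, "Cys": 6,
                     "Glu": 7, "Gly": 5, "His": 2}

print(len(ANALOG), predicted == TABLE_5_PREDICTED)
```

Under these assumptions the sequence-derived counts agree with the listed predictions, and the chain length matches the factor of 57 used to scale the mole percentages.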
4. Folding Of Analog "C" Proinsulin

Analog "C" proinsulin folding to obtain a crosslinked
form was accomplished by reaction of 4 mg of protein containing
about 30 percent analog "C" proinsulin sulfonate. The protein
was dissolved in a degassed buffer of 40 mM glycine, pH 10.6, 3
M urea, 0.3 M NaCl at 0°. To this was added β-mercaptoethanol
to a concentration of 0.4 mM and the reaction sealed under
N2. The course of the reaction was followed by measuring the
increase in RIA activity and was complete within about four
hours. The reaction was stopped by the addition of acetic acid.

The reaction mixture was purified by HPLC by
prep-collection from a C-18 Ultrasphere column. The resolving
gradient was 21-28 percent acetonitrile in 0.2 M ammonium
sulfate, 50 mM NaOAc, pH 4.0, at 0.2 percent/min and 1.0 ml/min.

5. Assay For Analog "C" Proinsulin

Analog "C" proinsulin has no crossreactivity in the
insulin radioimmunoassay as the S-sulfonate. However, the
formation of analog proinsulin as an expression product was
confirmed by crossreactivity of the thiol form. The samples of
unknowns were treated for two minutes at 90°C with 1 mM
β-mercaptoethanol, diluted into RIA buffer (0.1 M sodium
phosphate, pH 7.4, 0.15 M NaCl, 0.1 percent NaN3, 0.1 percent
gelatin) and immediately assayed. Reproducibility depends
upon strict timing since extended incubation of the diluted,
reduced test solution leads to variable oxidative folding of
the molecule into forms with higher RIA activity.

The thiol form gave an activity of 0.9 percent compared
to insulin's 100 percent. By comparison, bovine proinsulin had
RIA activity of 7.3 percent of insulin activity and the thiol
form of bovine proinsulin had an activity of 1.9 percent. The
thiol form of the B chain of porcine proinsulin had activity of
0.1 percent. On incubation with a slight excess of
β-mercaptoethanol to yield the folded, crosslinked product,
reaction mixtures show RIA activity of 20-40 percent that of
insulin.

C. Preparation of Human Insulin

1. Folding and Linking of A and B Chains. 

The human proinsulin or analogs thereof, prepared in
accordance with the present invention, for example, following
the procedures of Parts A or B above, as their respective
S-sulfonates are induced to fold with proper formation of
internal disulfide bonds (between cysteine 7 B and cysteine
7 A, between cysteine 19 B and cysteine 20 A, and between
cysteines 6 and 11 A) by means of controlled sulfhydryl
interchange catalyzed by β-mercaptoethanol.

To a 0.1 mg/ml solution of proinsulin S-sulfonate in
degassed 50 mM sodium glycinate, pH 10.6, at 4°C was added
β-mercaptoethanol to a final concentration of 0.3 mM. After
four hours, the reaction is essentially complete as measured by
the increase in cross-reacting activity of the mixture in
insulin RIA. The yield of proinsulin is about 80 percent.
Proinsulin is then purified from side products by gel
permeation, ion exchange, and/or reverse phase high pressure
liquid chromatography to yield product in substantially
purified form.

2. Excision of Bridging Chain 

The human proinsulin or analogs thereof, prepared in
accordance with the present invention, for example, following
the procedure of Part C.1. above, are proteolytically converted
to insulin, for example in accordance with the procedure of
Kemmler et al., J. Biol. Chem., 246, 6786 (1971). The obtained
insulin is then purified by column chromatography or zinc
crystallization to yield product in substantially purified
form, identical to natural human insulin and freed of
biologically active contaminants.
While the invention in its most preferred embodiment is
described with reference to E. coli, other microorganisms could
likewise serve as host cells, for example, yeasts such as
Saccharomyces cerevisiae, Bacilli such as Bacillus subtilis and
preferably other enterobacteriaceae among which may be
mentioned as examples Salmonella typhimurium and Serratia
marcescens, utilizing plasmids that can replicate and express
heterologous gene sequences in these organisms. The
invention is not to be limited to the preferred embodiments
described.
CLAIMS:

1. A process for producing a chimeric polypeptide
comprising:

a) the polypeptide sequence of a proinsulin comprising
the A and B chains of human insulin connected by a bridging 
chain of at least 2 amino acid units, said bridging chain 
having sites at each end which permit its excision from between 
said A and B chains; and 

b) an additional protein or protein fragment; 
there being a cleavage site at or adjacent said additional 
protein or fragment and adjacent one end of the polypeptide 
sequence of said proinsulin; 

comprising the steps: 

1) inserting a gene coding for said proinsulin into a
microbial cloning vehicle in which the gene is in reading
phase with a DNA sequence coding for said additional protein,
or fragment comprising said cleavage site;

2) transforming said cloning vehicle containing said
inserted gene into a microbial host for expression of said 
chimeric polypeptide; 

3) expressing the chimeric polypeptide; and 

4) isolating the expressed chimeric polypeptide. 

2. A process according to claim 1 wherein the amino acid
sequence of the bridging chain corresponds to that of the C
peptide of human proinsulin.

3. A process according to claim 1 wherein the amino acid
sequence of the bridging chain is Arg-Arg-Gly-Ser-Lys-Arg.

4. A process according to claim 1 wherein the amino acid
sequence of the bridging chain is Arg-Arg.

5. A process according to claim 1 or claim 2 wherein the
sites permitting excision of the bridging chain are the amino 
acid units Arg-Arg at the B chain end and Lys-Arg at the A 
chain end. 

6. A process according to any one of claims 1 to 5 wherein 
said cleavage site is a methionine unit. 

7. A process according to claim 6 wherein the methionine 
unit is adjacent the N-terminal of said proinsulin. 

8. A process according to any one of the preceding claims 
wherein the additional protein or fragment is either 

a) at least a substantial portion of β-galactosidase; or

b) a fragment of the trp leader polypeptide fused to 
a portion of the trp E polypeptide. 

9. A process for producing a proinsulin comprising the 
process of any one of the preceding claims and the additional 
step of cleaving said chimeric polypeptide to release said 
proinsulin.

10. A process for producing a protein comprising the process
of claim 9 and the additional step of excising said bridging
chain from said proinsulin.

11. A process according to claim 10 wherein
the excision of the bridging chain is preceded by the
formation of disulfide bonds between the A and B
chains and the product of excision is human insulin.

12. A process according to any one of the preceding
claims wherein step (2) affords a viable culture of
microbial transformants containing the cloning vehicle,
which culture is cultivated to perform step (3), wherein
step (4) comprises separating the resulting cellular
mass; and isolating a said chimeric polypeptide from it.

13. A method of producing human insulin comprising:
1) cultivating a viable culture of microbial
transformants containing a cloning vehicle suited for
transformation of a microbial host and use therein for
expressing a chimeric polypeptide comprising:

a) the polypeptide sequence of a proinsulin
comprising the A and B chains of human insulin connected
by a bridging chain of at least 2 amino acid units, said
bridging chain having sites at each end which permit
its excision from between said A and B chains; and

b) an additional protein or protein fragment;
there being a cleavage site at or adjacent said
additional protein or fragment and adjacent one end of
the polypeptide sequence of said proinsulin;

2) separating the resulting cellular mass; 3) isolating
the precursors to human insulin comprising the chimeric
polypeptide; 4) cleaving the additional protein
therefrom; 5) effecting folding and linkage of the
A and B chains; and 6) excising the bridging chain.

14. A method according to claim 13 wherein the cloning
vehicle is plasmid pHI7.

15. A method according to claim 13 wherein the cloning
vehicle is plasmid pBCA5.



